Abstract

An algebraic method is presented for assessing the numbers of paths through loop-free
portions of computer programs, thus providing a useful measure of complexity.
The  underlying  program  model  is  that  of  a  directed  graph,  though  this  does  not 
exclude  recursion.  The  algebraic  structure  is  that  of  a  path  algebra  but  with  one  axiom 
relaxed.  The  theory  has  been  implemented  within  the  Malvern  Program  Analysis  Suite 
(MALPAS).  Experience  has  shown  it  to  be  of  particular  value  in  guiding  the  analysis 
of  large  programs. 


Controller  HMSO  London 
1988 




Contents 

1  INTRODUCTION 

2  THE  PATH  ASSESSOR 

3  CONCLUSION 




1  INTRODUCTION 


A  recent  report  [1]  summarises  the  Malvern  Program  Analysis  Suite,  MALPAS,  a  set 
of  tools  for  the  assessment  and  verification  of  software  that  may  be  used  throughout 
development  or  post  facto.  Two  of  the  analysers  reveal  the  flows  of  control  and  information 
through  a  given  program.  The  resources  required  for  each  of  these  are  polynomial  functions 
of  the  size  of  the  program  in  terms  of  numbers  of  statements  and  declarations  of  data.  A 
further  tool,  the  Semantic  Analyser,  executes  the  program  symbolically  and  provides  a 
description  of  each  loop-free  program  path,  loops  being  exercised  precisely  once.  However, 
semantic  analysis  is  non-polynomial  in  nature.  In  particular,  the  number  of  paths  through 
a  loop-free  procedure  can  depend  exponentially  on  the  number  of  conditional  statements. 


TYPE anytype;

FUNCTION p(integer, anytype): boolean;

FUNCTION f(integer, anytype): anytype;

FUNCTION g(integer, anytype): anytype;

PROCSPEC iffy(INOUT x: anytype);

PROC iffy;

  IF p(1,x) THEN x := f(1,x) ELSE x := g(1,x) ENDIF;
  IF p(2,x) THEN x := f(2,x) ELSE x := g(2,x) ENDIF;
  ...
  IF p(n,x) THEN x := f(n,x) ELSE x := g(n,x) ENDIF
ENDPROC
FINISH


Figure 1: The procedure “iffy” has n conditional statements in series and 2^n potential
paths.

Given the path-condition,

  NOT p(1,x) AND p(2, g(1,x)) AND p(3, f(2, g(1,x))) AND
  ... AND p(n, f(n-1, f(n-2, ... f(2, g(1,x)) ... ))),

the corresponding action is

  x := f(n, f(n-1, f(n-2, ... f(2, g(1,x)) ... ))).

Figure  2:  A  typical  path-condition  for  the  procedure  “iffy”  and  the  corresponding 
input-output  relation. 

Figure 1, expressed in MALPAS Intermediate Language (IL) [2,3], illustrates the problem.
The semantic analysis of the procedure “iffy” yields a path-condition and consequent
action for each path. For example, the path for which all the p's are true on execution,
except for the first, is described algebraically in figure 2.

It may be seen that the numbers of ASCII characters contained in the path-condition
and action depicted in figure 2 are respectively quadratic and linear in n. Further, there
are 2^n program paths. So, ignoring possible algebraic simplifications, the volume of
information presented to the analyst is of order n^2 2^n. Thus, there will be programs for
which the unfettered application of semantic analysis is doomed to failure. No matter how
big or how fast the computer, n can always be increased to defeat it.
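The growth described above can be reproduced for small n with a short Python sketch. It is an illustration only, not the Semantic Analyser: the helper names `iffy_path` and `output_volume` are hypothetical, and the strings it builds merely mimic the shape of the expressions in figure 2.

```python
from itertools import product


def iffy_path(choices):
    """Return (path_condition, action) for one path through "iffy".

    choices[i] is True when the test p(i+1, x) succeeds on that path.
    Both results are plain strings built by symbolic substitution.
    """
    x = "x"  # symbolic value of x accumulated so far
    conjuncts = []
    for i, taken in enumerate(choices, start=1):
        test = f"p({i},{x})"
        conjuncts.append(test if taken else f"NOT {test}")
        x = f"f({i},{x})" if taken else f"g({i},{x})"
    return " AND ".join(conjuncts), f"x := {x}"


def output_volume(n):
    """Total characters over the path-conditions and actions of all paths."""
    total = 0
    for choices in product([True, False], repeat=n):
        condition, action = iffy_path(choices)
        total += len(condition) + len(action)
    return total


if __name__ == "__main__":
    print(iffy_path((False, True, True)))
    for n in range(1, 11):
        print(n, 2 ** n, output_volume(n))
```

Each extra conditional doubles the number of paths and lengthens every path-condition, so the printed volume more than doubles from one row to the next.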

Two techniques have been implemented to circumvent the problem just described and
each involves processing a given program prior to semantic analysis. The Partial
Programmer [4,5], given a subset of program variables nominated by the user, generates a
new program dedicated to their specific calculation by removing irrelevant conditional
statements. Reducing n by 1 in figure 1 halves the volume of output presented to the
analyst. In practice, more dramatic improvements have been experienced.

Node marking [6], on the other hand, involves partitioning the program by fixing key
way-points for retention. Semantic analysis then reveals the behaviour of the program
between the retained nodes. A choice of dominator, P, for example, would generate
expressions like those depicted in figure 2 for the (symbolic) execution from the start to P
and from P to the end, effectively re-introducing an element of sequential logic. If P
partitions the program into pieces that each contain n/2 conditionals, the volume of output
decreases by a factor 2^(1+n/2).
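The reduction factor for node marking can be checked with a short calculation. It is a sketch under two assumptions: the order-of-magnitude estimate of output volume for n conditionals in series, written V(n) here, applies to each piece separately, and constant factors are ignored. The symbols V and V_marked are introduced only for this illustration.

```latex
\begin{align*}
V(n) &\approx n^{2}\,2^{n}, \\
V_{\mathrm{marked}}(n) &\approx 2\,V(n/2)
   = 2\left(\tfrac{n}{2}\right)^{2} 2^{n/2}
   = \tfrac{1}{2}\,n^{2}\,2^{n/2}, \\
\frac{V(n)}{V_{\mathrm{marked}}(n)} &\approx
   \frac{n^{2}\,2^{n}}{\tfrac{1}{2}\,n^{2}\,2^{n/2}}
   = 2^{\,1+n/2}.
\end{align*}
```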

In  the  light  of  experience,  it  became  clear  that  an  estimate  of  the  number  of  program 
paths  would  be  essential  if  the  analysis  of  large  programs  were  to  be  tackled.  The  purpose  of 
the  following  section  is  therefore  to  outline  the  theory  behind  the  MALPAS  Path  Assessor. 

2  THE  PATH  ASSESSOR 

Consider, first, an arbitrary loop-free procedure, with a single start and single end, that
may or may not call other procedures including itself. Suppose the procedure to be
modelled as a labelled, directed graph G whose nodes correspond to simple statements or
procedure calls and whose arcs are labelled with the distinct elements of an abstract
alphabet A. The details of the map from the procedure to the graph need not concern us.
The point is that the number of paths through the graph indicates the number of
syntactically possible paths through the procedure without expanding any procedure calls.¹
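As a point of comparison, the number of start-to-end paths in a small loop-free graph can be counted directly by memoised recursion. This is a standard dynamic-programming count, not the algebraic method of this paper, and the graph encoding and node names (`t0`, `f0`, `g0`, ...) are hypothetical choices made for the sketch.

```python
from functools import lru_cache


def count_paths(graph, start, end):
    """Count distinct start-to-end paths in a loop-free directed graph.

    graph maps each node to a list of its successor nodes; a parallel arc
    is represented by listing the same successor more than once.
    """
    @lru_cache(maxsize=None)
    def paths_from(node):
        if node == end:
            return 1
        # Sum over outgoing arcs; a dead-end node contributes no paths.
        return sum(paths_from(succ) for succ in graph.get(node, ()))

    return paths_from(start)


def iffy_graph(n):
    """Build a graph for n conditionals in series (hypothetical node names)."""
    graph = {}
    for i in range(n):
        test, then_node, else_node = f"t{i}", f"f{i}", f"g{i}"
        join = f"t{i + 1}"
        graph[test] = [then_node, else_node]
        graph[then_node] = [join]
        graph[else_node] = [join]
    return graph


if __name__ == "__main__":
    n = 10
    print(count_paths(iffy_graph(n), "t0", f"t{n}"))
```

For the series of conditionals in figure 1 the count doubles with each added test.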

By employing node reduction [7,8], the graph G may be transformed into a regular
expression in A [9] that involves only the operations × (sequence) and + (alternation).
Owing to the loop-free assumption, * is omitted. Let the set of such restricted expressions
be R(A).

Separately, consider the set N of natural numbers {0, 1, 2, ...}, closed under the arithmetic
operations × and +.² Then the algebra {N, ×, +} satisfies all but one of the axioms of
a path algebra [10], + not being idempotent. (For completeness, when applied to N, +
is commutative and associative; × is associative and left and right distributive over +.
There is a multiplicative identity, namely 1, and an element, namely 0, that is both an
additive identity and a two-sided multiplicative zero.)

¹ In MALPAS IL, the specifications of procedures are called rather than their bodies.

² Note that × and + are defined for both regular expressions and integers.

Next, we set up a map, PATHS, from R(A) to N, first by assigning 1 to each arc in
G and thereafter by natural extension. Thus, for each a in A and for each u, v in R(A):


    PATHS(a)     = 1

    PATHS(u × v) = PATHS(u) × PATHS(v)          (1)

    PATHS(u + v) = PATHS(u) + PATHS(v)

Manifestly, the axioms (1) are sufficient to guarantee that the result of applying PATHS
to any restricted regular expression is independent of the sequence of evaluation, modulo
standard conventions on removing brackets. Furthermore, the map is useful in so far as it
calculates in polynomial time³ the number of paths through G.
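The three rules defining PATHS translate directly into a recursive evaluator over an expression tree. The sketch below is an illustration, not the MALPAS implementation: the classes `Arc`, `Seq` and `Alt` and the helper `iffy_expr` are hypothetical names chosen for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Arc:
    """A single labelled arc of the graph, i.e. a letter of the alphabet A."""
    label: str


@dataclass(frozen=True)
class Seq:
    """Sequence of two restricted expressions (the × operation)."""
    left: object
    right: object


@dataclass(frozen=True)
class Alt:
    """Alternation of two restricted expressions (the + operation)."""
    left: object
    right: object


def paths(expr):
    """Evaluate PATHS: 1 for an arc, product for Seq, sum for Alt."""
    if isinstance(expr, Arc):
        return 1
    if isinstance(expr, Seq):
        return paths(expr.left) * paths(expr.right)
    if isinstance(expr, Alt):
        return paths(expr.left) + paths(expr.right)
    raise TypeError(f"not a restricted expression: {expr!r}")


def iffy_expr(n):
    """Expression for n >= 1 two-way conditionals in series."""
    expr = None
    for i in range(1, n + 1):
        branch = Alt(Arc(f"then{i}"), Arc(f"else{i}"))
        expr = branch if expr is None else Seq(expr, branch)
    return expr


if __name__ == "__main__":
    print(paths(iffy_expr(20)))
```

The evaluator visits each node of the expression once, so counting the paths through twenty conditionals in series requires a walk over a few dozen nodes rather than the enumeration of about a million paths.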

Note, however, that PATHS is not a homomorphism from the restricted algebra of
languages over A into N. An element of the former is a set of words with + being set union
and idempotent. The axiom of idempotency is not preserved by the map. Nevertheless,
because G's arc labels are distinct, sub-expressions of the form s + s never appear.
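The failure of idempotency can be seen by comparing PATHS with the size of the language an expression denotes. In the sketch below, expressions are encoded as nested tuples `("arc", label)`, `("seq", u, v)` and `("alt", u, v)`; this encoding and the function names are assumptions made for the illustration.

```python
def paths(expr):
    """PATHS evaluated over the natural numbers."""
    tag = expr[0]
    if tag == "arc":
        return 1
    _, u, v = expr
    if tag == "seq":
        return paths(u) * paths(v)
    if tag == "alt":
        return paths(u) + paths(v)
    raise ValueError(f"unknown tag: {tag!r}")


def language(expr):
    """The set of words denoted by expr, each word a tuple of arc labels."""
    tag = expr[0]
    if tag == "arc":
        return {(expr[1],)}
    _, u, v = expr
    if tag == "seq":
        return {x + y for x in language(u) for y in language(v)}
    if tag == "alt":
        return language(u) | language(v)  # set union is idempotent
    raise ValueError(f"unknown tag: {tag!r}")


if __name__ == "__main__":
    s = ("arc", "s")
    repeated = ("alt", s, s)                       # s + s
    distinct = ("alt", ("arc", "a"), ("arc", "b"))  # a + b
    print(paths(repeated), len(language(repeated)))
    print(paths(distinct), len(language(distinct)))
```

For s + s the two counts disagree, since the language collapses to a single word while PATHS returns 2; for a + b, with distinct labels, they agree.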

Finally,  for  programs  with  loops,  the  analysis  above  may  be  applied  to  each  loop-free 
portion  of  each  procedure.  One  approach  is  to  use  node  reduction  to  generate  regular 
expressions  while  preserving  nodes  with  self-loops  as  they  are  encountered. 

3  CONCLUSION 

The  technique  of  path  assessment  just  described  was  implemented  by  RSRE  and  added 
to  the  Malvern  Program  Analysis  Suite  where  it  is  now  supported  commercially.  The 
MALPAS  Path  Assessor  has  been  used  in  many  real  applications.  It  provides  a  simple 
but  accurate  assessment  of  the  structural  complexity  of  software  and  can  indicate  where 
practical problems may arise in attempting to run the Semantic or Compliance Analysers
[11].

It is interesting to note that the algebra on which path assessment is based is not a
path algebra, which seems to be an example of a more general phenomenon [12].

Finally,  it  gives  me  great  pleasure  to  thank:  J-A  Fernandez  for  his  efficient  and  accurate 
implementation  of  the  ideas  presented  in  this  paper;  Dr  J  M  Foster  for  a  correspondence 
relating  to  the  relaxation  of  algebraic  axioms;  and  H  C  Williams  for  suggestions  that  have 
added  clarity  to  the  text. 